Setup

First, install some new packages.

if(!require(tmap)){install.packages("tmap")}
if(!require(sf)){install.packages("sf")}
if(!require(maps)){install.packages("maps")}

Then, activate the ones we need.

library(tidyverse)
library(readxl)
library(tmap)        # Geographical plotting
library(gapminder)   # Country level data
library(maps)        # Geographical datasets
library(sf)          # Shape files (sf) are a common geographical format 
library(plotly)      # Dynamic graphs

Now, load the data (included in the packages). The World dataset contains geographical data.

data("World")
str(World)           # structure of World
Classes ‘sf’ and 'data.frame':  177 obs. of  16 variables:
 $ iso_a3      : Factor w/ 177 levels "AFG","AGO","ALB",..: 1 2 3 4 5 6 7 8 9 10 ...
 $ name        : Factor w/ 177 levels "Afghanistan",..: 1 4 2 166 6 7 5 56 8 9 ...
 $ sovereignt  : Factor w/ 171 levels "Afghanistan",..: 1 4 2 159 6 7 5 52 8 9 ...
 $ continent   : Factor w/ 8 levels "Africa","Antarctica",..: 3 1 4 3 8 3 2 7 6 4 ...
 $ area        : Units: [km^2] num  652860 1246700 27400 71252 2736690 ...
 $ pop_est     : num  28400000 12799293 3639453 4798491 40913584 ...
 $ pop_est_dens: num  43.5 10.3 132.8 67.3 15 ...
 $ economy     : Factor w/ 7 levels "1. Developed region: G7",..: 7 7 6 6 5 6 6 6 2 2 ...
 $ income_grp  : Factor w/ 5 levels "1. High income: OECD",..: 5 3 4 2 3 4 2 2 1 1 ...
 $ gdp_cap_est : num  784 8618 5993 38408 14027 ...
 $ life_exp    : num  59.7 NA 77.3 NA 75.9 ...
 $ well_being  : num  3.8 NA 5.5 NA 6.5 4.3 NA NA 7.2 7.4 ...
 $ footprint   : num  0.79 NA 2.21 NA 3.14 2.23 NA NA 9.31 6.06 ...
 $ inequality  : num  0.427 NA 0.165 NA 0.164 ...
 $ HPI         : num  20.2 NA 36.8 NA 35.2 ...
 $ geometry    :sfc_MULTIPOLYGON of length 177; first list element: List of 1
  ..$ :List of 1
  .. ..$ : num [1:69, 1:2] 5310471 5408503 5470647 5477147 5541607 ...
  ..- attr(*, "class")= chr [1:3] "XY" "MULTIPOLYGON" "sfg"
 - attr(*, "sf_column")= chr "geometry"
 - attr(*, "agr")= Factor w/ 3 levels "constant","aggregate",..: NA NA NA NA NA NA NA NA NA NA ...
  ..- attr(*, "names")= chr [1:15] "iso_a3" "name" "sovereignt" "continent" ...
data("gapminder")
head(gapminder)

The world data is very rich: it has many attributes (population, economy, etc.). The one column we are interested in is geometry: it contains the data for the maps!

We see that a few points are missing for life expectancy, so let’s see if we can complete with the gapminder data (spoiler: not so much!).

This exercise (see below) of grouping two datasets is paramount when plotting geographical content because usually, the geometric properties come from one file and the items we wish to plot come from another file. Thus, the joining step is KEY.

Areas

Maps in which areas are coloured to show some numerical attribute are called choropleths.

The idea below is to merge the two sources to combine geographical forms and life expectancy. To do that, we need left_join(). But before, we need to create a common variable (i.e., a key) between the two. In World, the term “name” is not so great so let’s change it into “country” (just like in gapminder!). Then we can merge!

names(World)[2] <- "country"                    # Change the column name
gapminder %>% left_join(World, by = "country")  # Join the two datasets
nrow(gapminder)                                 # Number of instances

There are missing points. As a rough approximation, let’s just remove them and store the result in data.

data <- gapminder %>%                     # Join and save via the arrow operator <- 
  left_join(World, by = "country") %>%
  na.omit()                               # Remove rows with missing data
nrow(data)                                # Number of instances in dataset
[1] 1296

The loss is large because of missing points! We then use the tmap package to create a choropleth.

data %>% 
  filter(year == 2007) %>%    # Keep only one year (problems otherwise)
  sf::st_sf() %>%             # Transform data into shape file (sf) format
  tm_shape() +                # Tranforms the sf into a tmap object
  tm_polygons("lifeExp")      # Plot!

Some countries are missing => data problem because gapminder is incomplete. Let’s complete it via: https://www.gapminder.org/data/ This yields a much bigger dataset, called “gap2.RData”. Let’s see:

load("gap2.RData")
head(gap2)

It’s highly tidy! Let’s un-tidy it a bit to put it in the original gapminder format.

gap2 <- gap2 %>% pivot_wider(names_from = "indicator", values_from = "value")

Again, we resort to left_join to merge the two data sources.

data <- World %>% 
  left_join(gap2, by = "country") #%>%
  #na.omit() 

Next, we move towards interactive maps with the leaflet package. The advantage of leaflet lies in its multiple options that create dynamic plots. The example below is strongly inspired from the reference page: https://rstudio.github.io/leaflet/choropleths.html Thus, looking at the difference between the two codes helps understand how they work.

We are going two use two features:
- a customized color scale;
- labels that appear when the cursor is on the map.

For the scale, we define it manually, both for the colors and for the separation points (bins).
The labels are defined using html functions.

if(!require(leaflet)){install.packages("leaflet")}                  # Install package
library(leaflet)                                                    # Load package
datamap <- data %>% filter(year == 2011)                            # Keep only one year (recent)
palet <- colorBin("YlGnBu",                                         # Yellow-Green-Blue palette
                  domain = datamap %>% pull(lifeExp), # LifeExp     # Domain of labels: lifeExp
                  bins = c(50, 55, 60, 65, 70, 80, 90))             # Age categories

labels <- sprintf(                                                  # Below we define the labels
  "<strong>%s</strong><br/>%g Years",                               # Adding text to label
  datamap$country,                                                  # We show the country name...
  datamap$lifeExp                                                   # ... and the life expectancy
) %>% lapply(htmltools::HTML)                                       # Embedded all into html language

We can then move forward with the plot.

data %>% 
  filter(year == 2011) %>%                        # Keep the one year of data
  data.frame() %>%                                # Turn into dataframe (technical)
  sf::st_sf() %>%                                 # Format in sf
  st_transform("+init=epsg:4326") %>%             # Convert in particular coordinate reference 
  leaflet() %>%                                   # Call leaflet
  addPolygons(fillColor = ~palet(lifeExp),        # Create the map (colored polygons)
              weight = 2,                         # Width of separation line
              opacity = 1,                        # Opacity of separation line
              color = "white",                    # Color of separation line
              dashArray = "3",                    # Dash size of separation line
              fillOpacity = 0.7,                  # Opacity of polygon colors
              highlight = highlightOptions(       # 5 lines below control the cursor impact
                weight = 2,                       # Width of line
                color = "#0B3790",                # Color of line
                dashArray = "",                   # No dash
                fillOpacity = 0.7,                # Opacity
                bringToFront = TRUE),
              label = labels,                     # LABEL! Defined above!
              labelOptions = labelOptions(        # Label options below...
                style = list("font-weight" = "normal", padding = "3px 8px"),
                textsize = "15px",
                direction = "auto")
  ) %>%
  addLegend(pal = palet,                    # Legend: comes from palet colors defined above
            values = ~lifeExp,              # Values come from lifeExp variable
            opacity = 0.9,                  # Opacity of legend
            title = "Map Legend",           # Title of legend
            position = "bottomright")       # Position of legend

Better, but some African countries are still missing… Data problem!

It is possible to integrate leaflet into shiny: https://rstudio.github.io/leaflet/shiny.html You need renderLeaflet() in the server and leafletOutput() in the UI.

Points

Points are more easy to handle because they require less information (two coordinates at least plus other attributes like size or color). This is exactly the kind of data type that fits in the ggplot syntax! Below, we show how to combine ggplot with geographical data.

Country level data

While areas must be defined at the country level (complex!), points are simply characterized by two numbers: latitude & longitude. So a list of these two features suffices to plot points. For countries, it is possible to find data online, ex: https://developers.google.com/public-data/docs/canonical/countries_csv

The latitudes and longitudes correspond to the center of the countries.

Below, we create a new dataset with longitudes & latitutes combined with gapminder (updated version).

if(!require(readxl)){install.packages("readxl")}    # Install package to read excel files
library(readxl)                                     # Activate package
options(scipen=999)                                 # Remove scientific notation
country_LL <- read_excel("country_LL.xlsx")         # Read longitude/latitude data
data2 <- left_join(gap2, country_LL) %>%            # Merge two datasets
  na.omit() %>%                                     # Remove missing points
  mutate(latitude = as.numeric(latitude),           # 'Numerize' latitude
         longitude = as.numeric(longitude),         # 'Numerize' longitude
         population = round(population / 10^6,1))   # Scaling population into millions
data2 %>% head()

Be careful, the code below is computationally demanding (be patient). We use the viridis color palette. In ggplot, a small list of maps is available: world, France, Italy, New Zealand, US (and states).

if(!require(scales)){install.packages("scales")}
if(!require(viridis)){install.packages("viridis")} # Package for color gradient
library(viridis)                                    
library(scales) 
world <- map_data("world")
g <- ggplot(data = world) +                               # Data source
  geom_polygon(data = world, aes(x = long,                # Dark background of the world
                                 y = lat, 
                                 group = group)) +   
  geom_point(data = data2 %>% filter(year == 2018),       # Most recent data points
             aes(x = longitude,                           # Classical ggplot syntax!
                 y = latitude, 
                 size = population, 
                 color = lifeExp, 
                 label = country)) +                      # Label (for plotly)
  scale_color_viridis(direction = -1)                                   # Color scale
ggplotly(g)                                               # Plotly integration

City level data

You can find longitudes & latitudes by searching on Google. Here’s one example for France: https://simplemaps.com/data/fr-cities Worldwide data can be accessed here: https://simplemaps.com/data/world-cities

cities <- read_excel("worldcities.xlsx")

That’s too big. Let’s focus on Germany, Italy, Austria, Slovenia and Switzerland (arbitrarily).

cities_short <- cities %>% 
  filter(country %in% c("Germany", "Italy", "Switzerland", "Austria", "Slovenia"),
         population > 250000) %>%
  na.omit()
head(cities_short)
world <- map_data("world")
g <- ggplot(data = world) +                                     # Data source
  geom_polygon(data = world,                                    # Dark background of the world
               aes(x = long, y = lat, group = group),
               size = 0.5, color = "grey") +                    # Outer lines of polygons
  geom_point(data = cities_short,                               # The points
             aes(x = lng, 
                 y = lat, 
                 size = population, 
                 color = country, 
                 label = city)) +
  #scale_colour_gradient(low = "#FFFB00", high = "#FF1B00") +  # This line is for color = population
  scale_colour_manual(values = c("#FFEC00", "#66E234", "#1BCBF7", "#CC44F8", "#FC2361")) +
  coord_sf(xlim = c(5, 18), ylim = c(36, 57), expand = FALSE)             # Zoom on Europe
ggplotly(g)

Other maps

For points, it’s easy: just use the same coordinate system via longitude & latitude. It’s much more tricky for areas.

The key is to find shape files (one alternative format is GeoJSON, but it is not treated here).
Example: http://datapages.com/gis-map-publishing-program/gis-open-files/global-framework/ Be very careful: some maps are EXTREMELY precise (up to 1m precision). Thus, the files are heavy and the plotting takes a LOT of time. Below, we use 100m precision, which is largely enough.

The full description of all shape file items can be accessed on wikipedia:

  • .shp — shape format; the feature geometry itself
  • .shx — shape index format; a positional index of the feature geometry to allow seeking forwards and backwards quickly
  • .dbf — attribute format; columnar attributes for each shape, in dBase IV format
regions <- sf::st_read("regions-20140306-100m.shp")
Reading layer `regions-20140306-100m' from data source `/Users/coqueret/Documents/IT/COURS/2021/R4DS/S7_extensions/Geocomputing/regions-20140306-100m.shp' using driver `ESRI Shapefile'
Simple feature collection with 27 features and 10 fields
geometry type:  MULTIPOLYGON
dimension:      XY
bbox:           xmin: -61.80976 ymin: -21.38973 xmax: 55.83665 ymax: 51.08984
geographic CRS: WGS 84
str(regions)    # Structure of the variable region
Classes ‘sf’ and 'data.frame':  27 obs. of  11 variables:
 $ code_insee: chr  "42" "72" "83" "25" ...
 $ nom       : chr  "Alsace" "Aquitaine" "Auvergne" "Basse-Normandie" ...
 $ nom_cl    : chr  "Strasbourg" "Bordeaux" "Clermont-Ferrand" "Caen" ...
 $ insee_cl  : chr  "67482" "33063" "63113" "14118" ...
 $ nuts2     : chr  "FR42" "FR61" "FR72" "FR25" ...
 $ iso3166_2 : chr  NA NA NA NA ...
 $ wikipedia : chr  "fr:Alsace" "fr:Aquitaine" "fr:Auvergne" "fr:Basse-Normandie" ...
 $ nb_dep    : num  2 5 4 3 4 4 6 4 2 4 ...
 $ nb_comm   : num  904 2296 1310 1812 2046 ...
 $ surf_km2  : num  8328 41818 26172 17786 31752 ...
 $ geometry  :sfc_MULTIPOLYGON of length 27; first list element: List of 1
  ..$ :List of 1
  .. ..$ : num [1:760, 1:2] 7.43 7.43 7.42 7.42 7.4 ...
  ..- attr(*, "class")= chr [1:3] "XY" "MULTIPOLYGON" "sfg"
 - attr(*, "sf_column")= chr "geometry"
 - attr(*, "agr")= Factor w/ 3 levels "constant","aggregate",..: NA NA NA NA NA NA NA NA NA NA
  ..- attr(*, "names")= chr [1:10] "code_insee" "nom" "nom_cl" "insee_cl" ...

That’s complicated. The only important column for now is the name of the region which is easy to recognize. Below, we plot a choropleth where the color is link to the area of the region (bigger = darker).

regions %>% 
  tm_shape() +
  tm_polygons("surf_km2")

That’s prety ugly. Let’s remove the regions outside the mainland.

regions %>%
  filter(! (nom %in% c("Guyane", "Mayotte", "Martinique", "Guadeloupe", "La Réunion"))) %>% 
  tm_shape() +
  tm_polygons("surf_km2", border.col = "white") # With the border of polygons

Below, we change the color palette and plot the number of French “départements” inside each region.

regions %>%
  filter(! (nom %in% c("Guyane", "Mayotte", "Martinique", "Guadeloupe", "La Réunion"))) %>% 
  tm_shape() +
  tm_fill("nb_dep", palette = "Spectral")  # Without the border

This is static: leaflet integration would be better.

Highcharts

The material below is inspired from the vignette https://cran.r-project.org/web/packages/highcharter/vignettes/charting-maps.html

The highcharter package creates good looking plots. Let’s install it (and activate it).

if(!require(highcharter)){install.packages("highcharter")}
library(highcharter)

The package has a large library of maps. The list is given here: https://code.highcharts.com/mapdata/

If you click on a link, you get the address of the map:

The last part is what you specify to get access to a map:

mapdata <- get_data_from_map(download_map_data("countries/fr/custom/fr-all-mainland"))
trying URL 'https://code.highcharts.com/mapdata/countries/fr/custom/fr-all-mainland.js'
Content type 'text/javascript' length 23676 bytes (23 KB)
==================================================
downloaded 23 KB
head(mapdata)

The information that will be plotted comes from this file. We need to add the value that we want to use for the plot. There are 21 regions in the variable map_data. This bypasses the merging of external data via the left_join() function.

mapdata <- mapdata %>% mutate(value = 1:nrow(mapdata)) # Completely random values
head(mapdata,20)

Below, we plot the regions of France and the color code is driven by the column “value” defined above.

hcmap("countries/fr/custom/fr-all-mainland",                      # Source for the map
      data = mapdata,                                             # Source for the color code
      value = "value",                                            # Column for color code
      joinBy = c("name"),                                         # Common name & label
      name = "Random Plot",                                       # Name of plot
      dataLabels = list(enabled = TRUE, format = "{point.name}"), # Allow labels
      borderColor = "#FAFAFA", borderWidth = 0.5,                 # Border info
      tooltip = list(valueDecimals = 1,                           # Label info
                     valuePrefix = "Value = ",                    # Before label
                     valueSuffix = " Units"))                     # After label
trying URL 'https://code.highcharts.com/mapdata/countries/fr/custom/fr-all-mainland.js'
Content type 'text/javascript' length 23676 bytes (23 KB)
==================================================
downloaded 23 KB

Fine grain local maps

Now let’s turn to smaller scale maps. The data comes from Kaggle, scrapped from tripadvisor. The useful columns: Longitude & Latitude, the MuseumName, the Rating and the LengthOfVisit.

With leaflet, it’s easy to obtain empty maps as long as you know the longitude & latitude of a particular location. Below: Lyon.

tripadvisor <- read_excel("tripadvisor_museum_US.xlsx") %>%    # Importing the data
  mutate(Longitude = as.numeric(Longitude),                    # Getting numbers back
         Latitude = as.numeric(Latitude),
         Rating = as.numeric(Rating),
         LengthOfVisit = as.factor(LengthOfVisit)) %>%
  filter(Latitude > 40, Latitude < 41, Longitude > (-74.5), Longitude < (-73.5)) # NY museums only
Problem with `mutate()` input `Longitude`.
ℹ NAs introduced by coercion
ℹ Input `Longitude` is `as.numeric(Longitude)`.NAs introduced by coercionProblem with `mutate()` input `Latitude`.
ℹ NAs introduced by coercion
ℹ Input `Latitude` is `as.numeric(Latitude)`.NAs introduced by coercion
tripadvisor

leaflet() %>%                                                  # An empty map: a good start!
  setView(lng = 4.85, lat = 45.75, zoom = 12) %>% 
  addTiles()
pal <- colorFactor(palette = "Spectral", domain = tripadvisor$LengthOfVisit)
tripadvisor %>%
  leaflet() %>%                                                  # A blank sheet...
  setView(lng = -73.9772, lat = 40.7808, zoom = 10) %>% 
  addTiles() %>%
  addCircles(lng = ~Longitude, lat = ~Latitude, 
             radius = ~Rating^4, label =  ~MuseumName,
             fillColor = ~pal(LengthOfVisit), fillOpacity = 0.9,  # Circle colors
             stroke = TRUE, color = "black", weight = 2) %>%      # Circle stroke
  addLegend(pal = pal,                    # Legend: comes from palet colors defined above
            values = ~LengthOfVisit,              # Values come from lifeExp variable
            opacity = 0.7,                  # Opacity of legend
            title = "Map Legend",           # Title of legend
            position = "bottomright")       # Position of legend

Below, you need to specify lon and lat in the source dataframe. Or you can use the hcaes() function

mapdata2 <- get_data_from_map(download_map_data("countries/us/custom/us-ny-congress-113"))
trying URL 'https://code.highcharts.com/mapdata/countries/us/custom/us-ny-congress-113.js'
Content type 'text/javascript' length 23596 bytes (23 KB)
==================================================
downloaded 23 KB
hcmap("countries/us/custom/us-ny-congress-113") %>%
  hc_mapNavigation(enabled = TRUE,
                   buttonOptions = list(
                     align = "right",
                     verticalAlign = "bottom")) %>%
  hc_add_series(data = tripadvisor %>% mutate(lon = Longitude, lat = Latitude, name = MuseumName),  # Add correct names
                type = "mappoint", name = "Museums") 
trying URL 'https://code.highcharts.com/mapdata/countries/us/custom/us-ny-congress-113.js'
Content type 'text/javascript' length 23596 bytes (23 KB)
==================================================
downloaded 23 KB

NA

Not as impressive as leaflet, but the map could probably be improved. NOTE: in the example above, the geographical background and the points are unrelated.

---
title: "Geocomputing with R"
output:
  html_notebook:
    toc: true
    toc_float: true
---


# Setup

First, install some new packages.


```{r, message = FALSE, warning = FALSE}
if(!require(tmap)){install.packages("tmap")}
if(!require(sf)){install.packages("sf")}
if(!require(maps)){install.packages("maps")}
```


Then, activate the ones we need.

```{r, message = FALSE, warning = FALSE}
library(tidyverse)
library(readxl)
library(tmap)        # Geographical plotting
library(gapminder)   # Country level data
library(maps)        # Geographical datasets
library(sf)          # Shape files (sf) are a common geographical format 
library(plotly)      # Dynamic graphs
```


Now, load the data (included in the packages). The World dataset contains geographical data. 

```{r}
data("World")
str(World)           # structure of World
data("gapminder")
head(gapminder)
```

The world data is very rich: it has many attributes (population, economy, etc.). The one column we are interested in is **geometry**: it contains the data for the maps!

We see that a few points are missing for life expectancy, so let's see if we can complete with the gapminder data (spoiler: not so much!).

**This exercise (see below) of grouping two datasets is paramount when plotting geographical content because usually, the geometric properties come from one file and the items we wish to plot come from another file. Thus, the joining step is KEY.**

# Areas

Maps in which areas are coloured to show some numerical attribute are called *choropleths*.

The idea below is to merge the two sources to combine geographical forms and life expectancy. To do that, we need **left_join**(). But before, we need to create a common variable (i.e., a *key*) between the two. In World, the term "name" is not so great so let's change it into "country" (just like in gapminder!). Then we can merge!

```{r, message = FALSE, warning = FALSE}
names(World)[2] <- "country"                    # Change the column name
gapminder %>% left_join(World, by = "country")  # Join the two datasets
nrow(gapminder)                                 # Number of instances
```

There are missing points. As a rough approximation, let's just remove them and store the result in **data**. 

```{r, message = FALSE, warning = FALSE}
data <- gapminder %>%                     # Join and save via the arrow operator <- 
  left_join(World, by = "country") %>%
  na.omit()                               # Remove rows with missing data
nrow(data)                                # Number of instances in dataset
```

The loss is large because of missing points! We then use the *tmap* package to create a choropleth.


```{r}
data %>% 
  filter(year == 2007) %>%    # Keep only one year (problems otherwise)
  sf::st_sf() %>%             # Transform data into shape file (sf) format
  tm_shape() +                # Tranforms the sf into a tmap object
  tm_polygons("lifeExp")      # Plot!
```


Some countries are missing => data problem because gapminder is incomplete.
Let's complete it via: https://www.gapminder.org/data/
This yields a much bigger dataset, called "gap2.RData". Let's see:

```{r}
load("gap2.RData")
head(gap2)
```

It's highly tidy! Let's un-tidy it a bit to put it in the original *gapminder* format. 

```{r}
gap2 <- gap2 %>% pivot_wider(names_from = "indicator", values_from = "value")
```

Again, we resort to left_join to merge the two data sources.

```{r, message = FALSE, warning = FALSE}
data <- World %>% 
  left_join(gap2, by = "country") #%>%
  #na.omit() 
```


Next, we move towards interactive maps with the *leaflet* package. The advantage of *leaflet* lies in its multiple options that create dynamic plots. The example below is strongly inspired from the reference page: https://rstudio.github.io/leaflet/choropleths.html
Thus, looking at the difference between the two codes helps understand how they work. 

We are going two use two features:   
- a customized color scale;    
- labels that appear when the cursor is on the map.

For the scale, we define it manually, both for the colors and for the separation points (bins).   
The labels are defined using html functions.


```{r, message = FALSE, warning = FALSE}
if(!require(leaflet)){install.packages("leaflet")}                  # Install package
library(leaflet)                                                    # Load package
datamap <- data %>% filter(year == 2011)                            # Keep only one year (recent)
palet <- colorBin("YlGnBu",                                         # Yellow-Green-Blue palette
                  domain = datamap %>% pull(lifeExp), # LifeExp     # Domain of labels: lifeExp
                  bins = c(50, 55, 60, 65, 70, 80, 90))             # Age categories

labels <- sprintf(                                                  # Below we define the labels
  "<strong>%s</strong><br/>%g Years",                               # Adding text to label
  datamap$country,                                                  # We show the country name...
  datamap$lifeExp                                                   # ... and the life expectancy
) %>% lapply(htmltools::HTML)                                       # Embedded all into html language
```

We can then move forward with the plot.

```{r, message = FALSE, warning = FALSE}
data %>% 
  filter(year == 2011) %>%                        # Keep the one year of data
  data.frame() %>%                                # Turn into dataframe (technical)
  sf::st_sf() %>%                                 # Format in sf
  st_transform("+init=epsg:4326") %>%             # Convert in particular coordinate reference 
  leaflet() %>%                                   # Call leaflet
  addPolygons(fillColor = ~palet(lifeExp),        # Create the map (colored polygons)
              weight = 2,                         # Width of separation line
              opacity = 1,                        # Opacity of separation line
              color = "white",                    # Color of separation line
              dashArray = "3",                    # Dash size of separation line
              fillOpacity = 0.7,                  # Opacity of polygon colors
              highlight = highlightOptions(       # 5 lines below control the cursor impact
                weight = 2,                       # Width of line
                color = "#0B3790",                # Color of line
                dashArray = "",                   # No dash
                fillOpacity = 0.7,                # Opacity
                bringToFront = TRUE),
              label = labels,                     # LABEL! Defined above!
              labelOptions = labelOptions(        # Label options below...
                style = list("font-weight" = "normal", padding = "3px 8px"),
                textsize = "15px",
                direction = "auto")
  ) %>%
  addLegend(pal = palet,                    # Legend: comes from palet colors defined above
            values = ~lifeExp,              # Values come from lifeExp variable
            opacity = 0.9,                  # Opacity of legend
            title = "Map Legend",           # Title of legend
            position = "bottomright")       # Position of legend
```


Better, but some African countries are still missing... Data problem!

It is possible to integrate *leaflet* into *shiny*: https://rstudio.github.io/leaflet/shiny.html
You need **renderLeaflet**() in the server and **leafletOutput**() in the UI.


# Points

Points are more easy to handle because they require less information (two coordinates at least plus other attributes like size or color). This is exactly the kind of data type that fits in the **ggplot** syntax! Below, we show how to combine ggplot with geographical data.

## Country level data

While areas must be defined at the country level (complex!), points are simply characterized by two numbers: **latitude** & **longitude**. So a list of these two features suffices to plot points. For countries, it is possible to find data online, ex: https://developers.google.com/public-data/docs/canonical/countries_csv

The latitudes and longitudes correspond to the center of the countries.

Below, we create a new dataset with longitudes & latitutes combined with *gapminder* (updated version).
```{r, message = FALSE, warning = FALSE}
if(!require(readxl)){install.packages("readxl")}    # Install package to read excel files
library(readxl)                                     # Activate package
options(scipen=999)                                 # Remove scientific notation
country_LL <- read_excel("country_LL.xlsx")         # Read longitude/latitude data
data2 <- left_join(gap2, country_LL) %>%            # Merge two datasets
  na.omit() %>%                                     # Remove missing points
  mutate(latitude = as.numeric(latitude),           # 'Numerize' latitude
         longitude = as.numeric(longitude),         # 'Numerize' longitude
         population = round(population / 10^6,1))   # Scaling population into millions
data2 %>% head()
```

Be careful, the code below is computationally demanding (be patient). We use the *viridis* color palette. 
In *ggplot*, a small list of maps is available: world, France, Italy, New Zealand, US (and states). 

```{r, message = FALSE, warning = FALSE}
if(!require(scales)){install.packages("scales")}
if(!require(viridis)){install.packages("viridis")} # Package for color gradient
library(viridis)                                    
library(scales) 
world <- map_data("world")
g <- ggplot(data = world) +                               # Data source
  geom_polygon(data = world, aes(x = long,                # Dark background of the world
                                 y = lat, 
                                 group = group)) +   
  geom_point(data = data2 %>% filter(year == 2018),       # Most recent data points
             aes(x = longitude,                           # Classical ggplot syntax!
                 y = latitude, 
                 size = population, 
                 color = lifeExp, 
                 label = country)) +                      # Label (for plotly)
  scale_color_viridis()                                   # Color scale
ggplotly(g)                                               # Plotly integration
```

## City level data

You can find longitudes & latitudes by searching on Google. Here's one example for France:
https://simplemaps.com/data/fr-cities
Worldwide data can be accessed here: https://simplemaps.com/data/world-cities

```{r, message = FALSE, warning = FALSE}
cities <- read_excel("worldcities.xlsx")
```

That's too big. Let's focus on Germany, Italy, Austria, Slovenia and Switzerland (arbitrarily).

```{r}
cities_short <- cities %>% 
  filter(country %in% c("Germany", "Italy", "Switzerland", "Austria", "Slovenia"),
         population > 250000) %>%
  na.omit()
head(cities_short)
```

```{r, message = FALSE, warning = FALSE}
world <- map_data("world")
g <- ggplot(data = world) +                                     # Data source
  geom_polygon(data = world,                                    # Dark background of the world
               aes(x = long, y = lat, group = group),
               size = 0.5, color = "grey") +                    # Outer lines of polygons
  geom_point(data = cities_short,                               # The points
             aes(x = lng, 
                 y = lat, 
                 size = population, 
                 color = country, 
                 label = city)) +
  #scale_colour_gradient(low = "#FFFB00", high = "#FF1B00") +  # This line is for color = population
  scale_colour_manual(values = c("#FFEC00", "#66E234", "#1BCBF7", "#CC44F8", "#FC2361")) +
  coord_sf(xlim = c(5, 18), ylim = c(36, 57), expand = FALSE)             # Zoom on Europe
ggplotly(g)
```


# Other maps

For points, it's easy: just use the same coordinate system via longitude & latitude. It's much more tricky for areas.

The key is to find **shape files** (one alternative format is GeoJSON, but it is not treated here).  
Example: http://datapages.com/gis-map-publishing-program/gis-open-files/global-framework/ 
Be very careful: some maps are *EXTREMELY* precise (up to 1m precision). Thus, the files are heavy and the plotting takes a *LOT* of time. Below, we use 100m precision, which is largely enough.

The full description of all shape file items can be accessed on wikipedia:

* .shp — shape format; the feature geometry itself   
* .shx — shape index format; a positional index of the feature geometry to allow seeking forwards and backwards quickly   
* .dbf — attribute format; columnar attributes for each shape, in dBase IV format

```{r, message = FALSE, warning = FALSE}
regions <- sf::st_read("regions-20140306-100m.shp")
str(regions)    # Structure of the variable region
```

That's complicated. The only important column for now is the *name* of the region which is easy to recognize. Below, we plot a choropleth where the color is link to the area of the region (bigger = darker).

```{r}
regions %>% 
  tm_shape() +
  tm_polygons("surf_km2")
```

That's prety ugly. Let's remove the regions outside the mainland. 

```{r}
regions %>%
  filter(! (nom %in% c("Guyane", "Mayotte", "Martinique", "Guadeloupe", "La Réunion"))) %>% 
  tm_shape() +
  tm_polygons("surf_km2", border.col = "white") # With the border of polygons
```

Below, we change the color palette and plot the number of French "départements" inside each region.

```{r}
regions %>%
  filter(! (nom %in% c("Guyane", "Mayotte", "Martinique", "Guadeloupe", "La Réunion"))) %>% 
  tm_shape() +
  tm_fill("nb_dep", palette = "Spectral")  # Without the border
```

This is static: *leaflet* integration would be better.

# Highcharts

The material below is inspired from the vignette https://cran.r-project.org/web/packages/highcharter/vignettes/charting-maps.html

The *highcharter* package creates good looking plots. Let's install it (and activate it).

```{r, message = FALSE, warning = TRUE}
if(!require(highcharter)){install.packages("highcharter")}
library(highcharter)
```

The package has a large library of maps. The list is given here:
https://code.highcharts.com/mapdata/

If you click on a link, you get the address of the map:

![](high_list.png)

The last part is what you specify to get access to a map:

```{r, message = FALSE, warning = FALSE}
mapdata <- get_data_from_map(download_map_data("countries/fr/custom/fr-all-mainland"))
head(mapdata)
```

The information that will be plotted comes from this file. We need to add the value that we want to use for the plot. There are 21 regions in the variable *map_data*. This bypasses the merging of external data via the **left_join**() function.

```{r, message = FALSE, warning = FALSE}
mapdata <- mapdata %>% mutate(value = 1:nrow(mapdata)) # Completely random values
head(mapdata,20)
```

Below, we plot the regions of France and the color code is driven by the column "value" defined above. 

```{r, message = FALSE, warning = FALSE}
hcmap("countries/fr/custom/fr-all-mainland",                      # Source for the map
      data = mapdata,                                             # Source for the color code
      value = "value",                                            # Column for color code
      joinBy = c("name"),                                         # Common name & label
      name = "Random Plot",                                       # Name of plot
      dataLabels = list(enabled = TRUE, format = "{point.name}"), # Allow labels
      borderColor = "#FAFAFA", borderWidth = 0.5,                 # Border info
      tooltip = list(valueDecimals = 1,                           # Label info
                     valuePrefix = "Value = ",                    # Before label
                     valueSuffix = " Units"))                     # After label
```


# Fine grain local maps

Now let's turn to smaller scale maps. The data comes from Kaggle, scrapped from tripadvisor. The useful columns: *Longitude* & *Latitude*, the *MuseumName*, the *Rating* and the *LengthOfVisit*. 

With leaflet, it's easy to obtain empty maps as long as you know the longitude & latitude of a particular location. Below: **Lyon**.

```{r}
tripadvisor <- read_excel("tripadvisor_museum_US.xlsx") %>%    # Importing the data
  mutate(Longitude = as.numeric(Longitude),                    # Getting numbers back
         Latitude = as.numeric(Latitude),
         Rating = as.numeric(Rating),
         LengthOfVisit = as.factor(LengthOfVisit)) %>%
  filter(Latitude > 40, Latitude < 41, Longitude > (-74.5), Longitude < (-73.5)) # NY museums only
tripadvisor

leaflet() %>%                                                  # An empty map: a good start!
  setView(lng = 4.85, lat = 45.75, zoom = 12) %>% 
  addTiles()
```

```{r}
pal <- colorFactor(palette = "Spectral", domain = tripadvisor$LengthOfVisit)
tripadvisor %>%
  leaflet() %>%                                                  # A blank sheet...
  setView(lng = -73.9772, lat = 40.7808, zoom = 10) %>% 
  addTiles() %>%
  addCircles(lng = ~Longitude, lat = ~Latitude, 
             radius = ~Rating^4, label =  ~MuseumName,
             fillColor = ~pal(LengthOfVisit), fillOpacity = 0.9,  # Circle colors
             stroke = TRUE, color = "black", weight = 2) %>%      # Circle stroke
  addLegend(pal = pal,                    # Legend: comes from palet colors defined above
            values = ~LengthOfVisit,              # Values come from lifeExp variable
            opacity = 0.7,                  # Opacity of legend
            title = "Map Legend",           # Title of legend
            position = "bottomright")       # Position of legend
```

Below, you need to specify **lon** and **lat** in the source dataframe. Or you can use the *hcaes*() function

```{r}
mapdata2 <- get_data_from_map(download_map_data("countries/us/custom/us-ny-congress-113"))
hcmap("countries/us/custom/us-ny-congress-113") %>%
  hc_mapNavigation(enabled = TRUE,
                   buttonOptions = list(
                     align = "right",
                     verticalAlign = "bottom")) %>%
  hc_add_series(data = tripadvisor %>% mutate(lon = Longitude, lat = Latitude, name = MuseumName),  # Add correct names
                type = "mappoint", name = "Museums") 
  
```

Not as impressive as leaflet, but the map could probably be improved. 
**NOTE**: in the example above, the geographical background and the points are unrelated.

# Resources

Below, a link of great pages & tutorials:
 
**Geocomputation with R** (online book): https://geocompr.robinlovelace.net    
**ggplot**!: https://ggplot2.tidyverse.org/reference/map_data.htmlPure     
**sf** (shape files): https://cran.r-project.org/web/packages/sf/vignettes/sf1.html   
**sf & R**: https://r-spatial.github.io/sf/articles/sf1.html   
**sf & ggplot**: https://www.r-spatial.org/r/2018/10/25/ggplot2-sf-2.html   
**list of sf**: http://datapages.com/gis-map-publishing-program/gis-open-files/global-framework/global-heat-flow-database/shapefiles-list  
**tmap**: https://cran.r-project.org/web/packages/tmap/vignettes/tmap-getstarted.html   
**ggmap**: https://github.com/dkahle/ggmap    
**leaflet**: https://rstudio.github.io/leaflet/    
**Oregon** (tutorial): http://geog.uoregon.edu/bartlein/courses/geog495/lec06.html    


```{r}

```

